Abstract

Volume of metric balls relates to rate-distortion theory and packing bounds on codes. In this 
paper, the volume of balls in complex Grassmann manifolds is evaluated for an arbitrary radius. The 
ball is defined as a set of hyperplanes of a fixed dimension with reference to a center of possibly 
different dimension, and a generalized chordal distance for unequal dimensional subspaces is used. 
First, the volume is reduced to a one-dimensional integral representation. The overall problem boils down
to evaluating a determinant of a matrix of the same size as the subspace dimensionality. Interpreting 
this determinant as a characteristic function of the Jacobi ensemble, an asymptotic analysis is carried 
out. The obtained asymptotic volume is moreover refined using moment-matching techniques to provide 
a tighter approximation in finite-size regimes. Lastly, the pertinence of the derived results is shown by 
rate-distortion analysis of source coding on Grassmann manifolds. 


I. Introduction 

A Grassmann manifold is the collection of subspaces of a given dimension in a vector space. 
Grassmann manifolds find many applications due to their relation to eigenspaces of matrices, see 
e.g. [1-3]. In the context of multi-antenna transmissions, complex Grassmannian codes have
notably been used for non-coherent space-time coding [4-6] and channel-aware precoding [7,8].

A metric ball in the Grassmann manifold is analogous to a spherical cap on a sphere and
evaluating its volume is critical for several performance measures of Grassmannian codes. 
When constructing codes as point-sets with the largest possible minimum distance, the volume 
of a metric ball directly applies to packing bounds on codes [9,10]. Moreover, rate-distortion 
theory on Grassmann manifolds has been extensively applied to channel quantization analysis 
of precoded MIMO systems [11-14]. In this source-coding context, the volume of a metric ball is
closely related to the cumulative distribution of quantization errors for a uniformly distributed 
source. 

Computing the volume of a metric ball in a manifold is recognized to be a difficult task. For
the Grassmann manifold, this was addressed in several previous works. The volume of a metric
ball in generic Grassmannians was derived for line packing in [8], whereas asymptotic
evaluations for arbitrary subspace dimension were provided in [9], and in [15] for balls with
a center of different dimension. A small ball approximation was considered in [10], which was
later derived exactly in [13] for balls of radius less than one. More recently, an exact volume
formula for packing (2D) planes has been derived with application to massive MIMO [16]. While
the range of validity of the small ball volume corresponds to code sizes that grow exponentially
with the dimension, the known asymptotics converge slowly, providing only
asymptotic scaling laws.

The goal of this paper is to provide an accurate but simple volume approximation for regimes 
not covered by previous works. The obtained asymptotics is complementary to the result in [13] 
and with faster convergence than in [9,15]; it is however limited to the complex case and its 
derivation is not straightforwardly generalizable to real Grassmannians. We consider the case of 
possibly unequal dimension between elements in the ball and the center of the ball as in [13, 
15]. Related problems with subspaces of non-equal dimensions arise for example in [6,17- 
19]. We start by discussing generalizations of the well-known Grassmann chordal distance [1] 
for subspaces of different dimensions, as well as relevant symmetries of the volume of ball 
with our choice of distance. Then, from the known representation of the volume of ball as 
a multi-dimensional integration over principal angles [9,13,20,21], we reduce the problem to 
a one-dimensional integral related to a Fourier transform. The formulation is valid for any 
radius, and reduces the problem to the evaluation of a determinant that only depends on the
dimension parameters. Accordingly, this integral can be computed exactly with fixed parameters 


August 4, 2015 


DRAFT 





and we provide several examples illustrating its versatility. The exact volumes have different
polynomial representations in different ranges of the integer part of the squared radius. Radius less
than one [13] is itself a specific regime. From this, it can be anticipated that even though
an exact generic formula could be derived, it will most probably be a linear combination of
special functions, similarly to [16]. Such a representation may then not be very amenable from
an application perspective, as for example one often needs to invert the volume to compute bounds
on codes.

To provide a good and relevant approximation for large-dimensional Grassmann manifolds, an
asymptotic analysis is further carried out. As all the dimension-related parameters are concentrated
in a determinant inside the one-dimensional integral reformulation, the problem reduces
to studying the asymptotic behavior of the determinant. This determinant is the partition function
of the so-called time-dependent Jacobi ensemble. Interpreting it as a characteristic function
of a linear spectral statistic, its asymptotic Gaussianity can be leveraged from random matrix
theory [22]. This leads to an asymptotic formulation of the volume of a metric ball in terms of Gauss
error functions, which however appears loose in some finite regimes. We provide a finite-size
correction to the asymptotic formula via the exact moments of the considered linear statistic,
leading to a tighter volume approximation while preserving the simplicity of the asymptotic
form.

Finally, the derived asymptotic formula of the volume of a metric ball (with its finite-size correction)
is applied to rate-distortion theory on Grassmann manifolds. With increasing dimensions, it
is shown to provide a good estimate of the rate-distortion trade-off in a source coding problem.
The correction is notable compared with using a small ball approximation outside of its regime
of validity.

The rest of this paper is organized as follows. Pertinent definitions and properties are given in 
Section II. The volume of metric ball is reduced in Section III to a single-fold integral leading 
to some examples of exact derivations. In Section IV, the asymptotic behavior of volume of 
metric ball is derived and corrected by moment-matching techniques for finite-size applications. 
In Section V, the derived expression is applied to source-coding on the Grassmann manifold. 
The paper is concluded in Section VI. 

II. Preliminaries

A. Grassmann Manifolds and Chordal Distance 

The complex Grassmann manifold $G^{\mathbb{C}}_{n,p}$ is the collection of p-dimensional subspaces in an
ambient n-dimensional complex vector space $\mathbb{C}^n$. This is a homogeneous space of the unitary
group $U_n$ as there is a left action of $U_n$ on $G^{\mathbb{C}}_{n,p}$ that acts transitively. A plane $\mathcal{P} \in G^{\mathbb{C}}_{n,p}$ can
be described by infinitely many orthogonal bases leading to a non-unique semi-unitary matrix
representation $\mathbf{P} \in \mathbb{C}^{n \times p}$ such that $\mathbf{P}^H \mathbf{P} = I_p$, where $(\cdot)^H$ is the conjugate transpose of a matrix.

There are several possible choices to define a distance on the Grassmann manifold. We consider 
the chordal distance [1,2] which is related to an embedding of the Grassmannian into a Euclidean
sphere, and has been prominently used in the literature [4,7,13,23,24]. The chordal distance is 
well-defined between subspaces with equal dimensions. However, one can find slight variations 
for its generalization to subspaces of unequal dimensions. We will use the same definition as 
e.g. [13,17,19] arising from the concept of the principal angles; discussions on its theoretical 
foundation can be found in the recent work [3]. 

In [3], it is shown that any measure of distance that only depends on the relative position 
between two subspaces must be a function of the principal angles. The collection of principal 
angles provides the relative position between subspaces which is transitive under the action 
of the unitary group. However, compressing this “vector-like distance” to a classical scalar 
distance d, one loses transitivity. Grassmann manifolds are not in general two-point homogeneous 
spaces [25]: one cannot necessarily find a unitary mapping between two pairs of equidistant 
points, i.e., a pair (P, Q) cannot always be mapped to a pair (P', Q') even if d(P, Q) = d(P', Q').

Consider two integers $p$ and $q$ satisfying $p, q \le n$ and $m = \min(p,q)$. Given $\mathcal{P} \in G^{\mathbb{C}}_{n,p}$
and $\mathcal{Q} \in G^{\mathbb{C}}_{n,q}$ with respective orthonormal bases $\mathbf{P} \in \mathcal{P}$, $\mathbf{Q} \in \mathcal{Q}$, one can define $m$ principal
angles [20] between these two subspaces. We denote the principal angles by $\theta_1, \ldots, \theta_m \in [0, \frac{\pi}{2}]$.
They are independent of the choice of coordinates and can be computed via the singular value
decomposition of $\mathbf{P}^H \mathbf{Q}$, whose singular values are $\{\cos\theta_i\}_{i=1}^{m}$. The considered squared chordal
distance is given by

$$ d_c^2(\mathcal{P},\mathcal{Q}) = \sum_{k=1}^{\min(p,q)} \sin^2\theta_k = \min(p,q) - \sum_{k=1}^{\min(p,q)} \cos^2\theta_k = \min(p,q) - \|\mathbf{P}^H \mathbf{Q}\|_F^2. \tag{1} $$
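
As an illustration of (1), the following minimal sketch computes the principal angles from the singular values of $\mathbf{P}^H\mathbf{Q}$ and checks that the two expressions of the squared chordal distance agree. The helper names (`random_subspace`, `principal_angles`, `chordal_dist_sq`) are hypothetical, and drawing a basis by QR decomposition of a complex Gaussian matrix is an assumption of this sketch, not a construction taken from the paper.

```python
import numpy as np

def random_subspace(n, k, rng):
    """Orthonormal basis (n x k) of a uniformly drawn k-dim subspace of C^n (assumed: QR of a Gaussian)."""
    g = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    basis, _ = np.linalg.qr(g)
    return basis

def principal_angles(P, Q):
    """min(p, q) principal angles, from the singular values cos(theta_k) of P^H Q."""
    s = np.linalg.svd(P.conj().T @ Q, compute_uv=False)
    return np.arccos(np.clip(s, 0.0, 1.0))

def chordal_dist_sq(P, Q):
    """Squared chordal distance (1): min(p, q) - ||P^H Q||_F^2."""
    m = min(P.shape[1], Q.shape[1])
    return m - np.linalg.norm(P.conj().T @ Q, "fro") ** 2

rng = np.random.default_rng(0)
P = random_subspace(6, 2, rng)   # p = 2
Q = random_subspace(6, 3, rng)   # q = 3
theta = principal_angles(P, Q)
print(np.sum(np.sin(theta) ** 2), chordal_dist_sq(P, Q))  # the two values coincide
```

The Frobenius-norm form avoids computing the angles explicitly, which is convenient when only the distance is needed.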


B. Relation to Other Chordal Distances 

It is noted in [3] that $d_c$ gives a notion of distance in the sense of a distance from a point
to a set, but it does not give a metric between subspaces of different dimensions since two
distinct subspaces of different dimensions can have distance zero. Nevertheless, a simple variation
leading to a properly-defined metric function is given in [3]. This is obtained by assigning $|p - q|$
additional principal angles with value $\frac{\pi}{2}$ for the dimensions mismatched between $\mathcal{P}$ and $\mathcal{Q}$. One
can then define

$$ d_{c'}^2(\mathcal{P},\mathcal{Q}) = \max(p,q) - \sum_{k=1}^{\min(p,q)} \cos^2\theta_k. \tag{2} $$

The chordal distance has also been generalized to subspaces of different dimensions from
their corresponding projection operators in [6]. This corresponds to the Euclidean distance of a
spherical embedding, each Grassmannian being itself embedded in a different
cross-sectional sphere, one for the p-dimensional subspaces and one for the q-dimensional
subspaces [1,10]. This gives a proper metric which can be
expressed in terms of principal angles as

$$ d_{c''}^2(\mathcal{P},\mathcal{Q}) = \|\mathbf{P}\mathbf{P}^H - \mathbf{Q}\mathbf{Q}^H\|_F^2 = p + q - 2 \sum_{k=1}^{\min(p,q)} \cos^2\theta_k. \tag{3} $$

Finally, we suggest a third metric for subspaces of unequal dimensions which provides a slight
reduction in the dimension of the embedding. The main observation is that all Grassmannians
in n dimensions can be embedded in a single sphere. This holds in fact for any flag
manifold as well [26]. To obtain this, one must detrace the projectors and rescale them as
$\bar{\mathbf{P}} = \sqrt{\frac{n}{p(n-p)}}\,(\mathbf{P}\mathbf{P}^H - \frac{p}{n} I)$ and $\bar{\mathbf{Q}} = \sqrt{\frac{n}{q(n-q)}}\,(\mathbf{Q}\mathbf{Q}^H - \frac{q}{n} I)$; then $\bar{\mathbf{P}}$ and $\bar{\mathbf{Q}}$ lie on the same unit
sphere. The corresponding Euclidean distance is

$$ d_{c'''}^2(\mathcal{P},\mathcal{Q}) = \|\bar{\mathbf{P}} - \bar{\mathbf{Q}}\|_F^2 = K_1 - K_2 \|\mathbf{P}^H \mathbf{Q}\|_F^2 = K_1 - K_2 \sum_{k=1}^{\min(p,q)} \cos^2\theta_k \tag{4} $$

with $K_1 = 2 + 2\sqrt{\frac{pq}{(n-p)(n-q)}}$ and $K_2 = \frac{2n}{\sqrt{pq(n-p)(n-q)}}$.


All the distances $d_c$, $d_{c'}$, $\frac{1}{\sqrt{2}} d_{c''}$ and $\frac{1}{\sqrt{K_2}} d_{c'''}$ reduce to the classical definition of chordal
distance for p = q. In the rest of this paper, we will keep the definition of the distance $d_c$ as
in (1) due to its compactness in terms of principal angles and for consistency with [13,17,19].
Corresponding results can be easily extended to the other distances considered above since $d_c$
includes the main information of interest, and differs only by constant factors from the other
metrics.


C. Metric Ball and Normalized Volume 

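Define the metric ball of q-dimensional subspaces with distance at most r from the p-dimensional
center $\mathcal{P} \in G^{\mathbb{C}}_{n,p}$ by

$$ B_{p,q}(r) = \{ \mathcal{Q} \in G^{\mathbb{C}}_{n,q} : d_c(\mathcal{P},\mathcal{Q}) \le r \}. \tag{5} $$

The ball $B_{p,q}(r)$ is a subset of $G^{\mathbb{C}}_{n,q}$ though it is defined with reference to a point in $G^{\mathbb{C}}_{n,p}$.
We consider the invariant Haar measure $\mu$, defining a uniform distribution on $G^{\mathbb{C}}_{n,q}$. For any
measurable set $S \subset G^{\mathbb{C}}_{n,q}$ and any $U \in U_n$, the Haar measure satisfies

$$ \mu(US) = \mu(S). \tag{6} $$

The quantity $\mu(B_{p,q}(r))$ is independent of the center $\mathcal{P}$, and we will simply write $\mu(B_{p,q}(r))$ or
even $\mu(B(r))$ when there is no ambiguity.

The invariant measure can be interpreted as a normalized volume

$$ \mu(B_{p,q}(r)) = \frac{\mathrm{vol}(B_{p,q}(r))}{\mathrm{vol}(G^{\mathbb{C}}_{n,q})}, \tag{7} $$

where with our choice of distance the corresponding volume of the Grassmann manifold is [27]

$$ \mathrm{vol}(G^{\mathbb{C}}_{n,q}) = \pi^{q(n-q)} \prod_{i=1}^{q} \frac{(q-i)!}{(n-i)!}. \tag{8} $$
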
It is shown in [3] that distances from principal angles as considered here are independent of
the dimension n of the ambient space. The choice of ambient space, however, has an impact on
the maximum possible value of $d_c$ and thus the range of $\mu$. From [13, Lem. 2] it can be deduced
that the chordal distance must satisfy $d_c(\mathcal{P},\mathcal{Q}) \le d_{\max}$ with $d_{\max}^2 = \min(p, q, n-p, n-q)$. As a
consequence $\mu(B_{p,q}(r))$ is defined on the range $[0, d_{\max}]$, its maximum is $\mu(B_{p,q}(d_{\max})) = 1$, and
the volume depends on n. The dependence on the choice of ambient space can be understood
by considering e.g. the problem of packing 2D real planes. It is impossible to find two fully
orthogonal planes in $\mathbb{R}^3$ (i.e. they intersect only in the zero vector), while it is possible in $\mathbb{R}^4$.

D. Symmetries and Complementary Balls 

Without loss of generality we will assume throughout the paper that $p \le q \le n$ and $p + q \le n$,
implying that $p \le n/2$. Results in other parameter ranges can be reproduced using the symmetry of the chordal
distance and the canonical isomorphism $G^{\mathbb{C}}_{n,p} \cong G^{\mathbb{C}}_{n,n-p}$:

$$ \mu(B_{p,q}(r)) = \mu(B_{q,p}(r)) \tag{9} $$
$$ \phantom{\mu(B_{p,q}(r))} = \mu(B_{n-p,n-q}(r)). \tag{10} $$


These symmetries were used for volume computations in [13]: for $p + q > n$, one can evaluate
$\mu(B_{p',q'}(r))$ with $p' = n - p$ and $q' = n - q$ satisfying $p' + q' < n$.

An additional symmetry that can be used for extending results from one Grassmannian to
another with a different range of radius values is as follows. Let p, q satisfy $p \le q \le n$
and $p + q \le n$. Then we have

$$ \mu(B_{p,q}(r)) = 1 - \mu\big(B_{p,n-q}(\sqrt{p - r^2})\big). \tag{11} $$

The proof is in Appendix A. For the specific case $q = \frac{n}{2}$, one sees that the volume is a
symmetric function around $r = \sqrt{p/2}$ since $\mu(B(r)) = 1 - \mu(B(\sqrt{p - r^2}))$. Combining (11) and
the result of [13] directly leads to the following elementary exact evaluation of the volume of balls
for any radius with n = 4, p = q = 2:

$$ \mu(B(r)) = \begin{cases} \frac{1}{2} r^8 & \text{for } r \le 1, \\ 1 - \frac{1}{2}(2 - r^2)^4 & \text{for } r > 1. \end{cases} \tag{12} $$

The symmetry (11) is illustrated in Figure 1. The exact evaluation for p = q and r < 1 in [13]
is highlighted in red. It can be directly used for computing the volume for $q = n - p$ and
$r > \sqrt{p - 1}$.
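
The closed form for n = 4, p = q = 2 can be checked numerically. The sketch below estimates the normalized volume by Monte Carlo, drawing uniformly distributed subspaces through a QR decomposition of complex Gaussian matrices (an assumption of this sketch), and compares it with the piecewise polynomial $\frac{1}{2}r^8$ for $r \le 1$ and $1 - \frac{1}{2}(2 - r^2)^4$ otherwise. The function names are hypothetical.

```python
import numpy as np

def haar_basis(n, k, rng):
    """Orthonormal basis of a uniformly drawn k-dim subspace of C^n."""
    g = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    basis, _ = np.linalg.qr(g)
    return basis

def mc_ball_volume(n, p, q, r, trials=5000, seed=0):
    """Monte Carlo estimate of mu(B_{p,q}(r)); the center is fixed since mu does not depend on it."""
    rng = np.random.default_rng(seed)
    P = np.eye(n, p, dtype=complex)
    hits = 0
    for _ in range(trials):
        Q = haar_basis(n, q, rng)
        d2 = min(p, q) - np.linalg.norm(P.conj().T @ Q, "fro") ** 2
        hits += int(d2 <= r * r)
    return hits / trials

def exact_volume_n4_p2(r):
    """Piecewise closed form for (n, p, q) = (4, 2, 2)."""
    return 0.5 * r**8 if r <= 1 else 1.0 - 0.5 * (2.0 - r * r) ** 4

for r in (0.9, 1.0, 1.2):
    print(r, mc_ball_volume(4, 2, 2, r), exact_volume_n4_p2(r))
```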

Fig. 1: Illustration of the symmetry (11) between $\mu(B_{p,q}(r))$ and $\mu(B_{p,n-q}(r))$: one curve is the
rotation of the other by 180° around the median.


E. Sphere-Covering/Packing Bounds 


When subspaces have the same dimension p = q, a direct application of the volume of
metric balls occurs in the evaluation of fundamental coding bounds. The Gilbert-Varshamov and
Hamming bounds, derived from sphere-covering and sphere-packing arguments, respectively,
relate the code's cardinality to its minimum distance. Namely, for any distance $\delta$, there exists a
code $\mathcal{C}$ with cardinality $|\mathcal{C}|$ such that

$$ \frac{1}{\mu(B(\delta))} \le |\mathcal{C}|, \tag{13} $$

while for any $(|\mathcal{C}|, \delta)$-code $\mathcal{C} \subset G^{\mathbb{C}}_{n,p}$ one must have

$$ |\mathcal{C}| \le \frac{1}{\mu(B(\delta/2))}. \tag{14} $$
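
To make the two bounds concrete, here is a minimal sketch that evaluates them for a given volume function. It assumes the standard forms of the bounds, a Gilbert-Varshamov guarantee of at least $1/\mu(B(\delta))$ codewords and a Hamming limit of at most $1/\mu(B(\delta/2))$ codewords, and uses the small-radius volume $\frac{1}{2}r^8$ for (n, p, q) = (4, 2, 2) as an example. The function names are hypothetical.

```python
import math

def gilbert_varshamov_size(volume, delta):
    """Smallest integer cardinality guaranteed by the sphere-covering argument."""
    return math.ceil(1.0 / volume(delta))

def hamming_size(volume, delta):
    """Largest integer cardinality allowed by the sphere-packing argument."""
    return math.floor(1.0 / volume(delta / 2.0))

def volume_n4_p2_small(r):
    """Normalized volume for (n, p, q) = (4, 2, 2), valid for r <= 1."""
    return 0.5 * r**8

delta = 1.0
print(gilbert_varshamov_size(volume_n4_p2_small, delta))  # lower end of the achievable range
print(hamming_size(volume_n4_p2_small, delta))            # upper limit on any code
```

Passing the volume as a callable keeps the bounds independent of which volume expression (exact, asymptotic or finite-size) is plugged in.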


III. Exact Integral Formulations 


The volume of a metric ball in the Grassmann manifold is known to be expressible as a
multivariate integration over principal angles. Here we reduce the problem to a one-dimensional
integral related to a Fourier transform. It is assumed without loss of generality that $p \le q \le n$
and $p + q \le n$; complementary cases can be treated by symmetry. It is worth noting that as
a consequence the range of radii of balls is $r \in [0, \sqrt{p}]$.


A. Multi-dimensional Integration 


An integration of the volume element on the Grassmann manifold can be split in three parts
including two densities on Stiefel manifolds that can be fully integrated. The overall calculation
then reduces to an integral over the marginal distribution of the principal angles [9,20]. With
the cosines of the principal angles $c_i = \cos\theta_i$, $i = 1, \ldots, p$, the volume of a metric ball $\mu(B(r))$
in complex Grassmann manifolds can be written as a p-dimensional integral of the form [13,20,21]

$$ \mu(B(r)) = V_{n,p,q} \int_{\substack{\sum_{i=1}^{p} c_i^2 \ge p - r^2 \\ 0 \le c_i \le 1}} \Delta^2(c^2) \prod_{i=1}^{p} (c_i^2)^{q-p} (1 - c_i^2)^{n-p-q} \, dc_i^2, \tag{15} $$


where the normalization constant $V_{n,p,q}$ is given by

$$ V_{n,p,q} = \prod_{j=1}^{p} \frac{\Gamma(n-j+1)}{\Gamma(j+1)\,\Gamma(n-q-j+1)\,\Gamma(q-j+1)}, \tag{16} $$

and

$$ \Delta(c^2) = \det\big(c_i^{2(p-j)}\big) = \prod_{1 \le i < j \le p} (c_i^2 - c_j^2) \tag{17} $$


denotes a Vandermonde determinant. We note here that the volume element is unique up to 
a scaling factor (which is included into the overall normalization) and the choice of distance 
affects only the domain of integration. 
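
As a numerical aid, the normalization constant can be evaluated from its product-of-Gamma-functions form, $\prod_{j=1}^{p} \Gamma(n-j+1) / [\Gamma(j+1)\Gamma(n-q-j+1)\Gamma(q-j+1)]$. The sketch below works in the log domain to avoid overflow for large dimensions; the function name is hypothetical. For (n, p, q) = (4, 2, 2) the constant equals 6, which is consistent with $\int_{[0,1]^2} (x_1 - x_2)^2 \, dx_1 dx_2 = 1/6$.

```python
import math

def log_norm_const(n, p, q):
    """log of V_{n,p,q}, computed with log-Gamma functions for numerical stability."""
    return sum(
        math.lgamma(n - j + 1)
        - math.lgamma(j + 1)
        - math.lgamma(n - q - j + 1)
        - math.lgamma(q - j + 1)
        for j in range(1, p + 1)
    )

print(math.exp(log_norm_const(4, 2, 2)))  # close to 6
```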

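Applying the change of variables $x_j = 1 - c_j^2$, Eq. (15) simplifies to

$$ \mu(B(r)) = V_{n,p,q} \int_{\substack{\sum_{i=1}^{p} x_i \le r^2 \\ 0 \le x_i \le 1}} \Delta^2(x) \prod_{j=1}^{p} x_j^{n-p-q} (1 - x_j)^{q-p} \, dx_j, \tag{18} $$

where $\Delta(x) = \det(x_i^{p-j}) = \prod_{1 \le i < j \le p} (x_i - x_j)$ similarly as in (17). The constraint $\sum_{i=1}^{p} x_i \le r^2$
in the integral (18) presents the main challenge to obtain an explicit expression.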

B. Time-dependent Jacobi Ensemble

To address the issue raised above, we rewrite the integral (18) by using an indicator function:

$$ \mu(B(r)) = V_{n,p,q} \int_0^{r^2} \int_{0 \le x_i \le 1} \delta\Big(t - \sum_{i=1}^{p} x_i\Big) \Delta^2(x) \prod_{j=1}^{p} x_j^{n-p-q} (1 - x_j)^{q-p} \, dx_j \, dt. \tag{19} $$

Here $\delta(\cdot)$ is the Dirac delta function, which admits the following Fourier representation

$$ \delta(t - a) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{\imath (t-a)\nu} \, d\nu. \tag{20} $$


Note that a similar idea of using an indicator function was considered in [28] for evaluating 
volumes in the unitary group. 

Inserting (20) into (19) and performing the integration over t, we arrive at

$$ \mu(B(r)) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\imath}{\nu} \big(1 - e^{\imath \nu r^2}\big) D_p(\nu) \, d\nu, \tag{21} $$

where

$$ D_p(\nu) = V_{n,p,q} \int_{0 \le x_i \le 1} \Delta^2(x) \prod_{j=1}^{p} x_j^{n-p-q} (1 - x_j)^{q-p} e^{-\imath \nu x_j} \, dx_j. \tag{22} $$

Comparing (18) to (21) and (22), we see that the reformulation amounts to eliminating the
constraint $\sum_{i=1}^{p} x_i \le r^2$ at the expense of introducing a deformation in the p-dimensional
integral (22). As such the main difficulty is now concentrated in evaluating $D_p(\nu)$, which is
independent of the radius and only depends on the dimension parameters. The integral in (22) is the
partition function of the so-called time-dependent Jacobi ensemble [29], which is the classical
Jacobi ensemble deformed by $e^{-\imath \nu \sum_j x_j}$.


C. One-dimensional Integral Formula and Exact Evaluations 

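It is possible to further simplify the volume formula to a one-dimensional integral. To proceed,
we use the Andreief integral identity [30,31], see Appendix B, as well as the symmetry of the
integrand to simplify the integral to

$$ D_p(\nu) = V_{n,p,q} \int_{0 \le x_i \le 1} \det\big(x_j^{i-1}\big)^2 \prod_{j=1}^{p} x_j^{n-p-q} (1 - x_j)^{q-p} e^{-\imath \nu x_j} \, dx_j \tag{23} $$
$$ \phantom{D_p(\nu)} = p!\, V_{n,p,q} \det\Big( \int_0^1 x^{i+j+n-p-q-2} (1 - x)^{q-p} e^{-\imath \nu x} \, dx \Big)_{i,j=1}^{p} \tag{24} $$
$$ \phantom{D_p(\nu)} = p!\, V_{n,p,q} \det\big( B(\alpha, \beta)\, {}_1F_1(\alpha, \alpha + \beta; -\imath\nu) \big), \tag{25} $$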
where the last equality is obtained by [32, Eq. 3.383] with

$$ \alpha = i + j + n - p - q - 1, \tag{26} $$
$$ \beta = q - p + 1. \tag{27} $$

Here

$$ B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} \tag{28} $$

is the Beta function and

$$ {}_1F_1(\alpha, \beta; x) = \sum_{k=0}^{\infty} \frac{(\alpha)_k}{(\beta)_k} \frac{x^k}{k!} \tag{29} $$

defines the confluent hypergeometric function, where $(\alpha)_k = \Gamma(\alpha + k)/\Gamma(\alpha)$ is the Pochhammer symbol.
Note that the step from (23) to (24) using the Andreief identity cannot be generalized to compute
volumes in real Grassmann manifolds. This is because the Vandermonde determinant in (23) is
not squared in the real case [13].

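Putting everything together, we obtain an integral representation of $\mu(B(r))$ for any radius,

$$ \mu(B(r)) = \frac{p!\, V_{n,p,q}}{2\pi} \int_{-\infty}^{\infty} \frac{\imath}{\nu} \big(1 - e^{\imath \nu r^2}\big) \det\big( B(\alpha, \beta)\, {}_1F_1(\alpha, \alpha + \beta; -\imath\nu) \big) \, d\nu. \tag{30} $$

For the special case p = q = n/2, the above general result simplifies further to

$$ \mu(B(r)) = \frac{p!\, V_{n,p,q}}{2\pi} \int_{-\infty}^{\infty} \frac{e^{\imath \nu r^2} - 1}{(\imath\nu)^{p^2+1}} \det\Big( \Gamma(i + j - 1) \Big( 1 - e^{-\imath\nu} \sum_{k=0}^{i+j-2} \frac{(\imath\nu)^k}{k!} \Big) \Big) \, d\nu. \tag{31} $$
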


The main technical difficulty in the single-integral formulations (30) and (31) lies in computing
the determinant of a p x p matrix. This is tractable and can be further carried out for specific small
values of the parameters n, p and q. We list some of the results in Appendix C. The obtained
expressions are verified to match Monte Carlo simulations in Fig. 2 for different values of n, p and
q. Fig. 2 also includes the asymptotic approximation derived and discussed in the next section.
As can be seen, balls of radius less than one cover a very limited portion of the space in
large dimensions. For example with (n, p, q) = (8, 4, 4) the radius constraint $r \le 1$ corresponds
to balls covering at most a fraction 0.000042 (about 0.0042%) of the space, which corresponds, according to the
Gilbert-Varshamov bound, to a code with at least $2.4 \times 10^{4}$ elements, or equivalently 14.5 bits.

Fig. 2: Volume formulas (30) and (31) (Appendix C) for $\mu(B(r))$ versus simulations and
asymptotic approximation (57).


IV. Asymptotic Analysis 


As shown in the previous section, it is possible, for any radius, to derive exactly the volume of
metric balls with specific values of n, p, q. However, the resulting expressions are cumbersome
in large dimensions. In this section, starting from the reformulation (21), the volume is analyzed
through asymptotics of the determinant $D_p(\nu)$, providing good and relevant approximations for
large-dimensional Grassmann manifolds. Again, we assume $p \le q \le n$ with $p + q \le n$; other
cases can be treated by symmetry.

A. Asymptotic Volume via Random Matrix Theory 

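For reasons that will become clear later, we consider the linear transforms $y_j = 2x_j - 1$,
$j = 1, \ldots, p$ in the integral (22). After calculating the Jacobians associated with the transforms,
we have

$$ D_p(\nu) = e^{-\imath \nu p/2} \tilde{D}_p(\nu), \tag{32} $$
$$ \tilde{D}_p(\nu) = \tilde{V}_{n,p,q} \int_{-1 \le y_i \le 1} \Delta^2(y) \prod_{j=1}^{p} (1 - y_j)^{q-p} (1 + y_j)^{n-p-q} e^{-\imath \nu y_j/2} \, dy_j, \tag{33} $$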
where

$$ \tilde{V}_{n,p,q} = 2^{-p(n-p)} V_{n,p,q}. \tag{34} $$

One can interpret $\tilde{D}_p(\nu)$ as the characteristic function of the random variable

$$ Y = \frac{1}{2} \sum_{j=1}^{p} y_j \tag{35} $$

over the so-called Jacobi ensemble

$$ f(y) = \tilde{V}_{n,p,q} \Delta^2(y) \prod_{j=1}^{p} (1 - y_j)^{q-p} (1 + y_j)^{n-p-q}. \tag{36} $$

In this form, the asymptotic behavior of the linear spectral statistic (35) is a well-investigated
subject in random matrix theory. Specifically, by using the result [22, Th. 3.2], straightforward
manipulations^1 show that (33) converges to

$$ \tilde{D}_p(\nu) = E\big[e^{-\imath \nu Y}\big] \tag{37} $$
$$ \phantom{\tilde{D}_p(\nu)} \to \exp\Big( -\imath \nu \frac{n - 2q}{4} - \frac{\nu^2}{32} \Big) \tag{38} $$

in the regime

$$ n, p, q \to \infty, \text{ with fixed } q - p \text{ and } n - p - q. \tag{39} $$


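To wit, in the asymptotic regime (39) the random variable (35) follows a Gaussian distribution
with mean and variance read off from (38) as

$$ E[Y] = \frac{n - 2q}{4}, \qquad V[Y] = \frac{1}{16}. \tag{40} $$

This is a central limit theorem for the linear statistic (35) of the Jacobi ensemble (36). By the
relation (32), we have

$$ D_p(\nu) \simeq \exp\Big( -\imath \nu \frac{n + 2p - 2q}{4} - \frac{\nu^2}{32} \Big). \tag{41} $$

Inserting this into (21), an asymptotic representation of the volume of metric balls is obtained as

$$ \mu(B(r)) \simeq \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\imath}{\nu} \big(1 - e^{\imath \nu r^2}\big) \exp\Big( -\imath \nu \frac{n + 2p - 2q}{4} - \frac{\nu^2}{32} \Big) \, d\nu. \tag{42} $$
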


^1 Namely, with the notations in [22], $g(x) = -\imath \frac{\nu}{2} x$ is a linear combination of only the first-order Chebyshev polynomial; then
by the identifications $a = q - p$, $b = n - p - q$, and $c_1 = -\imath \frac{\nu}{2}$, Theorem 3.2 from [22] gives $\log E[\exp(\sum_j g(y_j))] \to \frac{1}{8} c_1^2 - \frac{1}{2}(a - b) c_1$
as $p \to \infty$, where the expectation is over $f(y)$ in (36).

The imaginary part of the integrand is an odd function which integrates to zero. The real part
is even, from which an asymptotic volume formula is obtained as

$$ \mu(B(r)) \simeq \frac{1}{2} \mathrm{erf}\big( 2\sqrt{2}\, \alpha \big) - \frac{1}{2} \mathrm{erf}\big( 2\sqrt{2}\, (\alpha - r^2) \big), \tag{43} $$

where

$$ \alpha = \frac{1}{4} (n + 2p - 2q) \tag{44} $$

and

$$ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt \tag{45} $$

is the Gauss error function.

Although the derived volume formula (43) is asymptotically tight in the regime (39), it may
not be very accurate when used as a finite-size approximation. The asymptotic (38) is obtained
by letting the size p of the product in the Jacobi ensemble (36) grow to infinity, while $n, q \to \infty$,
keeping the exponents $a = q - p$ and $b = n - p - q$ fixed in (36). As will be observed below (see
Figures 3 and 4), the convergence to the asymptotic distribution is slower when either $|a - b|$,
a or b is large, leading to poor approximations with small p. This fact motivates us to find
finite-size corrections to the asymptotic mean and variance (40) while preserving the simplicity of
the form (43).
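
The asymptotic volume can be evaluated directly with the standard error function. The sketch below assumes the reading of the asymptotic result as a Gaussian law for the squared distance with mean $(n + 2p - 2q)/4$ and variance $1/16$, i.e. $\mu(B(r)) \simeq \frac{1}{2}\mathrm{erf}(2\sqrt{2}\,m) - \frac{1}{2}\mathrm{erf}(2\sqrt{2}\,(m - r^2))$ with $m = (n + 2p - 2q)/4$. The function name `rmt_volume` is hypothetical.

```python
import math

def rmt_volume(n, p, q, r):
    """Asymptotic (RMT) approximation of mu(B(r)) under the stated Gaussian reading."""
    m = (n + 2 * p - 2 * q) / 4.0   # asymptotic mean of the squared distance
    c = 2.0 * math.sqrt(2.0)        # 1 / (sigma * sqrt(2)) with sigma = 1/4
    return 0.5 * math.erf(c * m) - 0.5 * math.erf(c * (m - r * r))

# For (n, p, q) = (4, 2, 2) the exact volume at r = 1 is 1/2.
print(rmt_volume(4, 2, 2, 1.0))
```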


B. Finite-size Corrections via Exact Moments 


The idea here is to use exact moments of the linear statistic Y to construct a volume
approximation instead of using the asymptotic ones (40). Consistent with the asymptotic
Gaussianity in (38), we consider a Gaussian approximation of the random variable Y using the
first two moments. The exact moments of Y can be recursively obtained via the connection
between the moment generating function and an ordinary differential equation. Specifically, the
moment generating function of $Y = \sum_{j=1}^{p} y_j / 2$ is given by

$$ M_p(\nu) = \tilde{V}_{n,p,q} \int \cdots \int_{-1 \le y_i \le 1} \Delta^2(y) \prod_{j=1}^{p} (1 - y_j)^{a} (1 + y_j)^{b} e^{\nu y_j/2} \, dy_j, \tag{46} $$

where

$$ a = q - p, \qquad b = n - p - q \tag{47} $$

and $\tilde{V}_{n,p,q}$ is as defined in (34). By definition, we have

$$ \log M_p(\nu) = \sum_{j=1}^{\infty} \kappa_j \frac{\nu^j}{j!}, \tag{48} $$

where $\kappa_j$ denotes the j-th cumulant of Y. The derivative of the cumulant generating function (48)
satisfies, up to a translation in $\nu$, a nonlinear second-order differential equation [29], which in
our notations reads


(2z/(t"(z/))^ = {aiy) — ua'{v) + 2{2p + a + h)a'iy)) 




+4 (a(z/) - i>a\p) - p{p + b)) ^(2(t'(z/))^ - 2aa'{iy)^ , 


where 


M = EUy3T)! ■ T + ■ 


Inserting this into (49), the cumulants can be calculated in a recursive manner. The first three
cumulants are

$$ \kappa_1 = \frac{p(n - 2q)}{2n}, \tag{51} $$
$$ \kappa_2 = \frac{pq(n-p)(n-q)}{n^2(n^2 - 1)}, \tag{52} $$
$$ \kappa_3 = -\frac{2pq(n - 2p)(n - 2q)(n - p)(n - q)}{n^3(n^4 - 5n^2 + 4)}, \tag{53} $$

where we have substituted the parameters according to (47). Now we approximate the random
variable Y by a Gaussian with mean and variance

$$ E[Y] = \kappa_1, \qquad V[Y] = \kappa_2, \tag{54} $$

so that the corresponding moment generating function is approximated by

$$ M_p(\nu) \approx e^{\kappa_1 \nu + \kappa_2 \nu^2/2}. \tag{55} $$

Comparing the moment generating function (46) and the characteristic function (33), we have

$$ \tilde{D}_p(\nu) \approx e^{-\imath \kappa_1 \nu - \kappa_2 \nu^2/2}. \tag{56} $$

Following similar steps that led from (41) to (43), we arrive at a finite-size volume approximation

$$ \mu(B(r)) \approx \frac{1}{2} \mathrm{erf}\Big( \frac{\beta}{\sqrt{2\kappa_2}} \Big) - \frac{1}{2} \mathrm{erf}\Big( \frac{\beta - r^2}{\sqrt{2\kappa_2}} \Big), \tag{57} $$

where

$$ \beta = \frac{p(n - q)}{n}, \qquad \kappa_2 = \frac{pq(n-p)(n-q)}{n^2(n^2 - 1)}. \tag{58} $$


Recall that (57) is guaranteed to be asymptotically tight in the regime (39) according to the
asymptotic Gaussianity of Y. One can also verify with the change of variables $a = q - p$,
$b = n - p - q$ and by letting $p \to \infty$ that the mean and variance in (54) are

$$ E[Y] = \frac{(b - a)p}{2(a + b + 2p)} \to \frac{b - a}{4} = \frac{n - 2q}{4}, \tag{59} $$
$$ V[Y] = \frac{p(a + p)(b + p)(a + b + p)}{(a + b + 2p)^2 \big((a + b + 2p)^2 - 1\big)} \to \frac{1}{16}, \tag{60} $$

matching (40) as expected^2. From (59) and (60), one sees that the larger a and b, the slower the
convergence of the mean and the variance to their asymptotic values.
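
The finite-size approximation only needs the first two exact cumulants. The sketch below assumes the readings $\kappa_1 = p(n-2q)/(2n)$, $\kappa_2 = pq(n-p)(n-q)/(n^2(n^2-1))$, a third cumulant $-2pq(n-2p)(n-2q)(n-p)(n-q)/(n^3(n^2-1)(n^2-4))$, and a mean squared distance $p(n-q)/n$. For p = q = 1 the squared distance follows a Beta(n-1, 1) law, which gives an independent sanity check of these expressions. Function names are hypothetical.

```python
import math

def cumulants(n, p, q):
    """First three cumulants of the centered statistic Y (third one requires n > 2)."""
    k1 = p * (n - 2 * q) / (2 * n)
    k2 = p * q * (n - p) * (n - q) / (n**2 * (n**2 - 1))
    k3 = (-2 * p * q * (n - 2 * p) * (n - 2 * q) * (n - p) * (n - q)
          / (n**3 * (n**2 - 1) * (n**2 - 4)))
    return k1, k2, k3

def finite_size_volume(n, p, q, r):
    """Gaussian approximation of mu(B(r)) built on the exact mean and variance."""
    mean_sq_dist = p * (n - q) / n
    _, k2, _ = cumulants(n, p, q)
    s = math.sqrt(2.0 * k2)
    return 0.5 * math.erf(mean_sq_dist / s) - 0.5 * math.erf((mean_sq_dist - r * r) / s)

# Sanity check against Beta(n-1, 1) for p = q = 1, here with n = 5.
al, be = 4.0, 1.0
beta_mean = al / (al + be)
beta_var = al * be / ((al + be) ** 2 * (al + be + 1))
beta_k3 = 2 * al * be * (be - al) / ((al + be) ** 3 * (al + be + 1) * (al + be + 2))
print(cumulants(5, 1, 1), (beta_mean - 0.5, beta_var, beta_k3))
print(finite_size_volume(4, 2, 2, 1.0))  # exact value is 1/2
```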


C. Simulations 


In Figs. 3 and 5, we plot the volume of metric balls (15) calculated by the random matrix theory
(RMT) approximation (43) as well as the finite-size approximation (57). As a benchmark, we
also provide volume curves obtained by Monte Carlo simulations. The finite-size correction curves are
also included in Fig. 2 for comparison with exact expressions. In Fig. 3, the asymptotic behavior
in the regime (39) can be observed, i.e. letting n, p, q grow with fixed $q - p$ and $n - p - q$.
One verifies that convergence occurs for both approximations, while it can be observed to be
much faster for the finite-size correction curves in all cases. The convergence rate of the RMT
approximation depends on the constant values $a = q - p$ and $b = n - p - q$.

^2 It can be verified as well that the third cumulant vanishes asymptotically: $\kappa_3 \to 0$.

Fig. 3: Volumes of metric balls with RMT approx. (43) versus finite-size approx. (57) and simulation in the
regime (39). Each graph corresponds to a fixed value of $a = q - p$ and $b = n - q - p$:
(a, b) = (0, 0), (1, 0), (0, 1) and (3, 3). For every graph, curves from left to right correspond to
p = 2, 3, 4.

To evaluate the convergence between the two asymptotic approximations, we compute the
divergence between two Gaussian distributions $Y_1$ and $Y_2$ with means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$
as given in (40) and (54), respectively. The Hellinger distance is a type of f-divergence which is a
frequently used metric for spaces of probability distributions [33]. Accordingly, the distance
between two Gaussian distributions is [34]

$$ H(Y_1, Y_2) = \sqrt{1 - BC(Y_1, Y_2)}, \tag{61} $$

where

$$ BC(Y_1, Y_2) = \sqrt{\frac{2\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2}} \, \exp\Big( -\frac{1}{4} \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2} \Big) \tag{62} $$

is the corresponding Bhattacharyya coefficient. The distance (61) is displayed in Fig. 4 as a
function of p for fixed values of a and b. For all cases, the Hellinger distance converges to
zero as expected. A slower rate of convergence is nevertheless observed for larger a, b, or $|a - b|$,
and it can be observed that for the maximum considered value p = 30, the RMT approximation
is at more than half of the maximum distance from the finite-size approximation in the highest
dimensional cases.
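
The comparison between the asymptotic and the finite-size Gaussian models can be reproduced with a few lines. The sketch below uses the standard Hellinger distance between two univariate Gaussians, $H = \sqrt{1 - BC}$ with $BC = \sqrt{2\sigma_1\sigma_2/(\sigma_1^2+\sigma_2^2)}\,\exp(-(\mu_1-\mu_2)^2/(4(\sigma_1^2+\sigma_2^2)))$, and assumes the asymptotic moments $((n-2q)/4,\ 1/16)$ and the exact moments $(p(n-2q)/(2n),\ pq(n-p)(n-q)/(n^2(n^2-1)))$. Function names are hypothetical.

```python
import math

def hellinger_gaussians(mu1, var1, mu2, var2):
    """Hellinger distance between N(mu1, var1) and N(mu2, var2)."""
    bc = math.sqrt(2.0 * math.sqrt(var1 * var2) / (var1 + var2)) * math.exp(
        -((mu1 - mu2) ** 2) / (4.0 * (var1 + var2))
    )
    return math.sqrt(max(0.0, 1.0 - bc))

def rmt_vs_finite_size(p, a, b):
    """Distance between the asymptotic and exact-moment Gaussians for a = q - p, b = n - p - q."""
    q, n = a + p, a + b + 2 * p
    mu_asym, var_asym = (n - 2 * q) / 4.0, 1.0 / 16.0
    mu_exact = p * (n - 2 * q) / (2.0 * n)
    var_exact = p * q * (n - p) * (n - q) / (n**2 * (n**2 - 1.0))
    return hellinger_gaussians(mu_asym, var_asym, mu_exact, var_exact)

for p in (2, 10, 30):
    print(p, rmt_vs_finite_size(p, 0, 0), rmt_vs_finite_size(p, 3, 3))
```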

In Fig. 5 we consider cases of fixed p and growing values of n and q. In Fig. 5a, with p = q = 4
and n = 8, 9, 10, it is seen that the RMT curves shift horizontally away from the simulated
ones as $b = n - 2p$ increases. This means a shift in the mean value if we interpret the curves
as CDFs of the volume density. Simulations indicate that for p = q the RMT approximation (43)
incurs a nontrivial loss in the mean when the difference $b = n - 2p$ is greater than zero. In
Fig. 5b, we consider the case when n = 2q with q = 2, 4, 6 and 8 in clockwise order for a
fixed p = 2. It is observed that the RMT-based curves rotate away from the simulated ones as
$a = q - p$ increases. While here a = b and the means of both approximations are equal in (59),
the RMT approximation (43) fails to capture the variance of the volume density as q increases.
For all the cases considered in Fig. 5, the finite-size approximation (57) matches the simulation
almost exactly.

Fig. 4: Hellinger distance between two Gaussian distributions with moments given in (40)
and (54), for different fixed values of $a = q - p$ and $b = n - p - q$ as a function of p.

Fig. 5: Volumes of metric balls with RMT approx. (43) versus finite-size approx. (57). Comparison
with fixed values of p and growing values of n, q. Panels: (a) (n, p, q) = (n, 4, 4); (b) (n, p, q) = (2q, 2, q).

Intuitively, the RMT approximation (43) fails to capture the volume curves since the corresponding
asymptotic mean and variance (40) do not involve all the possible parameters p, q and
n. In particular, the variance (40) obtained by RMT is a constant. On the contrary, the mean
and variance (54) used to construct the finite-size approximation (57) are functions of all the
parameters.


V. Application to Source Coding on Grassmann Manifolds 


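The asymptotic volume of a ball with its finite-size correction can be applied to evaluate the
rate-distortion trade-off of a source quantization in large-dimensional Grassmann manifolds.

Given a code $\mathcal{C} = \{C_1, \ldots, C_N\} \subset G^{\mathbb{C}}_{n,p}$ with size $N = |\mathcal{C}|$, consider a uniformly distributed
source on $G^{\mathbb{C}}_{n,q}$ quantized to $\mathcal{C}$ using the chordal distance $d_c$ as a quantization map:

$$ G^{\mathbb{C}}_{n,q} \to \mathcal{C} \tag{63} $$
$$ Q \mapsto \arg\min_{C_k \in \mathcal{C}} d_c(Q, C_k). \tag{64} $$

A source code is considered optimal if it minimizes the average distortion of the process, i.e.
the average squared quantization error

$$ D(\mathcal{C}) = E_{Q \in G^{\mathbb{C}}_{n,q}}\Big[ \min_k d_c^2(C_k, Q) \Big] \tag{65} $$
$$ \phantom{D(\mathcal{C})} = \int_0^{p} z \, dF_{\mathcal{C}}(z) \tag{66} $$
$$ \phantom{D(\mathcal{C})} = \int_0^{p} \big(1 - F_{\mathcal{C}}(z)\big) \, dz, \tag{67} $$

where $F_{\mathcal{C}}(z) = \Pr\{Q \mid \min_k d_c^2(C_k, Q) \le z\}$ is the CDF of the quantization error. The distortion-rate
function is the infimum of all possible distortions for a given code size,

$$ D(N) = \inf_{|\mathcal{C}| = N} D(\mathcal{C}). \tag{68} $$
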
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The quantization map leads to the partition of the Grassmann manifold into Voronoi cells 
centered around the codewords which are defined as 


V, = {Q \ dl{C,,Q)<dl{Cj,Q),Wj}. 


(69) 
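The quantization map and the average distortion above can be illustrated numerically. The following is a minimal sketch, not the authors' simulation code: it assumes subspaces are represented by orthonormal bases, that the squared chordal distance takes the form min(p, q) - ||C^H Q||_F^2, and that uniformly distributed subspaces can be drawn from the QR factorization of a complex Gaussian matrix. All function names are hypothetical.

```python
import numpy as np


def random_subspace(n, k, rng):
    """Orthonormal basis (n x k) of a uniformly distributed k-dim subspace of C^n."""
    z = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    basis, _ = np.linalg.qr(z)  # column span of a Gaussian matrix is uniform
    return basis


def chordal_dist_sq(c, q):
    """Assumed squared chordal distance: min(p, q) - ||C^H Q||_F^2."""
    k = min(c.shape[1], q.shape[1])
    return k - np.linalg.norm(c.conj().T @ q, "fro") ** 2


def quantize(q, code):
    """Quantization map: index of the nearest codeword and the squared error."""
    errors = [chordal_dist_sq(c, q) for c in code]
    k = int(np.argmin(errors))
    return k, errors[k]


def average_distortion(code, n, q_dim, num_samples, rng):
    """Monte Carlo estimate of D(C) = E[min_k d_c^2(C_k, Q)]."""
    errors = [quantize(random_subspace(n, q_dim, rng), code)[1]
              for _ in range(num_samples)]
    return float(np.mean(errors))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, num_codewords = 4, 2, 8
    code = [random_subspace(n, p, rng) for _ in range(num_codewords)]
    print(average_distortion(code, n, p, 2000, rng))
```

The index returned by `quantize` identifies the Voronoi cell containing the source point, and the returned error is one sample of the random variable whose CDF is $F_{\mathcal{C}}(z)$.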


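It follows that the probability of the quantization error being less than or equal to a value $z$ is given by 

\begin{align}
& \Pr\Big\{ Q \ \Big|\ \min_k d_c^2(C_k, Q) \le z \Big\} \tag{70} \\
& \quad = \Pr\Big\{ \bigcup_{k=1}^{N} \big\{ Q \in (B_{C_k}(\sqrt{z}) \cap V_k) \big\} \Big\}. \tag{71}
\end{align}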


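As points on the manifold belonging to the borders of the Voronoi cells appear with probability zero, 
we further have 

\begin{align}
F_{\mathcal{C}}(z) &= \sum_{k=1}^{N} \Pr\{ B_{C_k}(\sqrt{z}) \cap V_k \} \tag{72} \\
&= \sum_{k=1}^{N} \mu\big( B_{C_k}(\sqrt{z}) \cap V_k \big). \tag{73}
\end{align}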

The CDF of quantization errors (73) is equal to $N$ times the volume of a ball as long as no 
border effect occurs, i.e. $F_{\mathcal{C}}(z) = N\mu(B(\sqrt{z}))$ on the interval $[0, \varrho^2]$ where $\varrho$ is the kissing radius of 
the code [35]: the shortest distance from a codeword to the border of a Voronoi cell. In general, 
for any $k$, $(B_{C_k}(\sqrt{z}) \cap V_k) \subset B_{C_k}(\sqrt{z})$, and so for any $z$ one has $F_{\mathcal{C}}(z) \le N\mu(B(\sqrt{z}))$. 

As shown in [13], using this upper bound to design an ideal distribution $F^*(z) = N\mu(B(\sqrt{z}))$ 
on $[0, z^*]$, with $z^*$ such that $N\mu(B(\sqrt{z^*})) = 1$, leads to a lower bound on distortions, i.e. $D(N) \ge 
\int_0^{z^*} z \, dF^*(z)$. Following this principle, after estimating $z^*$ by inverting the derived volume (57) 
from the previous section and by direct integration, we obtain the following approximation to 
the rate-distortion trade-off 

$$D(N) \gtrsim \beta - \ldots, \tag{74}$$

which is asymptotically a lower bound for large-dimensional codes as $n, p, q \to \infty$ with fixed 
$q - p$ and $n - p - q$. The parameters $\beta$ and $\kappa_2$ are given in (58). 
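The ideal-distribution principle described above can be sketched numerically for any nondecreasing volume function. The snippet below is an illustration under stated assumptions, not the derivation leading to (74): it solves $N\mu(B(\sqrt{z^*})) = 1$ by bisection and evaluates $\int_0^{z^*} z \, dF^*(z)$ after an integration by parts. As a test case it uses the smallest manifold $G_{2,1}$ (n = 2, p = q = 1), for which the ball volume is $\mu(B(r)) = r^2$; the function names are hypothetical.

```python
def solve_z_star(volume, num_codewords, z_max, tol=1e-12):
    """Bisection for z* with N * volume(sqrt(z*)) = 1 (volume nondecreasing)."""
    lo, hi = 0.0, z_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if num_codewords * volume(mid ** 0.5) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def distortion_lower_bound(volume, num_codewords, z_max, num_steps=20000):
    """Evaluate int_0^{z*} z dF*(z) with F*(z) = N * volume(sqrt(z)).

    Integration by parts gives z* F*(z*) - int_0^{z*} F*(z) dz with
    F*(z*) = 1; the remaining integral uses the trapezoidal rule.
    """
    z_star = solve_z_star(volume, num_codewords, z_max)
    h = z_star / num_steps
    values = [num_codewords * volume((i * h) ** 0.5)
              for i in range(num_steps + 1)]
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return z_star - integral


def volume_g21(r):
    """Ball volume in G_{2,1}: mu(B(r)) = r^2 for 0 <= r <= 1."""
    return min(r * r, 1.0)


if __name__ == "__main__":
    for num_codewords in (1, 2, 4, 8):
        print(num_codewords, distortion_lower_bound(volume_g21, num_codewords, 1.0))
```

In this test case $z^* = 1/N$ and the bound evaluates analytically to $1/(2N)$, which the numerical routine reproduces. Any other volume function, such as an approximation of the form (57), could be passed in place of `volume_g21`.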

The approximation (74) is compared to simulations in Fig. 6 for different values of n and p, 
with q = p. Codes with cardinality between 2 and 256, i.e., between one and eight bits, 
are considered. Rate-distortion trade-offs have been numerically minimized by applying vector 
quantization based on Lloyd's algorithm. In addition to the numerical vector quantization results, 
averaged distortions of random codes are shown, which by construction provide an upper bound 
on D(N). The derived approximation (74) is visibly a lower bound in large dimensions, though 
not for the smallest case (n, p) = (2, 1). For all other 
dimensions, it provides a rather good approximation for every cardinality. 

Fig. 6: The large-dimension bound (74) compared to simulated average distortions and 
high-resolution bounds [13]. Legend entries: large-dimension lower bound (74); large-codesize lower bound [13]. 
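The Lloyd-type vector quantization mentioned above can be sketched as follows. This is an illustrative version under assumptions, not the implementation used for Fig. 6: it takes q = p, uses the squared chordal distance p - ||C^H Q||_F^2, and updates each codeword as the span of the p dominant eigenvectors of the sum of projectors of the samples in its cell, which minimizes the summed squared chordal distance to those samples. Function names are hypothetical.

```python
import numpy as np


def random_subspace(n, k, rng):
    """Orthonormal basis of a uniformly distributed k-dim subspace of C^n."""
    z = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    basis, _ = np.linalg.qr(z)
    return basis


def chordal_dist_sq(c, q):
    """Assumed squared chordal distance: min(p, q) - ||C^H Q||_F^2."""
    k = min(c.shape[1], q.shape[1])
    return k - np.linalg.norm(c.conj().T @ q, "fro") ** 2


def lloyd(samples, code, num_iters):
    """Alternate nearest-codeword assignment and chordal-centroid update."""
    p = code[0].shape[1]
    history = []  # empirical distortion at the start of each iteration
    for _ in range(num_iters):
        errors = np.array([[chordal_dist_sq(c, q) for c in code]
                           for q in samples])
        labels = errors.argmin(axis=1)
        history.append(float(errors.min(axis=1).mean()))
        new_code = []
        for k, c in enumerate(code):
            members = [q for q, lab in zip(samples, labels) if lab == k]
            if not members:
                new_code.append(c)  # keep an unused codeword unchanged
                continue
            scatter = sum(q @ q.conj().T for q in members)
            _, vecs = np.linalg.eigh(scatter)  # ascending eigenvalues
            new_code.append(vecs[:, -p:])  # p dominant eigenvectors
        code = new_code
    return code, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = [random_subspace(4, 2, rng) for _ in range(500)]
    init = [random_subspace(4, 2, rng) for _ in range(4)]
    _, history = lloyd(samples, init, 10)
    print(history)
```

Both steps can only decrease the empirical distortion, so the recorded history is nonincreasing; the algorithm converges to a local minimum, which is why such results serve as numerical estimates rather than exact values of D(N).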

The large-dimension bound (74) is further compared to the high-resolution bounds in [13, Th. 
2]. The results in [13] are asymptotics in a different regime, valid for a 
sufficiently large codesize; a necessary condition can be found in [13]. Here, the smallest 
code size at which the results of [13] start to apply would be 0, 1, 5.4, 14.6, 27.1, 68.6 and 114.1 
bits for the cases (n, p) = (2, 1), (4, 2), (6, 3), (8, 4), (10, 4), (16, 4) and (16, 8), respectively. 
These cases are plotted in Fig. 6 from bottom up. The high-resolution bounds in [13] also provide 
very good approximations of the rate-distortion trade-off for much smaller codesizes, outside of 
their given range of validity. Nevertheless, in the large-dimensional regime and with fixed code 
cardinality, one can observe in Fig. 6 a trend in the slope of the rate-distortion trade-off which is 
not captured by the high-resolution bounds. The bound (74) provides a good approximation in 
almost all cases. This is a consequence of the volume evaluation (57) being valid for any radius. Moreover, 
the bounds in [13] depend on a rapidly decreasing constant $c_{n,p,q} \to 0$ as $n, p, q \to \infty$, which 
might lead to numerical issues in the large-dimension regime. For example, its value 
is $c_{n,p,q} = 4.2 \times 10^{-5}$ for (n, p) = (8, 4) and $c_{n,p,q} = 4.5 \times 10^{-35}$ for (n, p) = (16, 8), while we 
faced numerical computation errors for (n, p) = (32, 16) due to machine precision. 
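The precision issue can be illustrated by working with logarithms. The sketch below assumes, for p = q, the closed form $c_{n,p,p} = \frac{1}{(p(n-p))!} \prod_{i=1}^{p} \frac{(n-i)!}{(p-i)!}$ for the small-ball volume constant; this expression is an assumption taken here for illustration (it is consistent with the two values quoted above), and the function name is hypothetical.

```python
import math


def log10_volume_constant(n, p):
    """log10 of the assumed constant c_{n,p,p}, computed with log-gamma.

    c = (1 / (p(n-p))!) * prod_{i=1}^{p} (n-i)! / (p-i)!
    Logarithms avoid forming factorials such as (p(n-p))! in floating point.
    """
    log_c = -math.lgamma(p * (n - p) + 1)
    for i in range(1, p + 1):
        log_c += math.lgamma(n - i + 1) - math.lgamma(p - i + 1)
    return log_c / math.log(10.0)


if __name__ == "__main__":
    for n, p in ((8, 4), (16, 8), (32, 16)):
        print((n, p), log10_volume_constant(n, p))
```

For (n, p) = (32, 16) the factorial $(p(n-p))! = 256!$ exceeds the range of double-precision floating point, so a direct evaluation fails while the logarithmic form remains finite.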

VI. Conclusion 

We evaluated the volume of a metric ball in Grassmann manifolds. The case of a center with 
mismatched dimension is considered, and accordingly we discussed generalizations of the chordal 
distance to subspaces of unequal dimensions. First, a new symmetry property of the volume of 
a metric ball is presented. Then, the multivariate integration of the volume of a ball with any radius 
is performed, reducing the problem to a single-fold integral related to 
a Fourier transform. We also provide explicit examples in small dimensions for any radius from 
the obtained formula. For large dimensions, the derived integral expression provides a tractable 
starting point for asymptotic analysis. From the asymptotic behavior of the time-dependent Jacobi 
ensemble and by moment-matching techniques, we provide a simple asymptotic volume formula 
which gives a tight approximation in finite dimensions. This allows us to precisely 
quantify the rate-distortion trade-off of source coding problems in large-dimensional Grassmann 
manifolds. 

The results presented in this paper are valid for the Grassmann manifold over the complex 
field. Using the same methodology for a generalization to the real field does not appear trivial. 
The exact volume formulas were explicitly derived via the Andreief identity, which 
is also intrinsically connected with the asymptotic analysis presented here. By using this 
identity, one is able to relate the problem to a Hankel or Toeplitz determinant, whose asymptotic 
behavior has been extensively studied in statistics. In order to proceed with the Andreief identity, 
the square of the Vandermonde determinant in the volume element is instrumental. In contrast, 
for the real Grassmann manifolds, the Vandermonde determinant in the volume element is not 
squared, and one thus cannot directly reduce the problem in a similar fashion. It thus remains an 
open problem for future research to identify comparably accurate approximation techniques for 
the real case. 


Appendix A 

Proof of Symmetry Relationship (11) 

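Given $P \in G_{n,p}$ with orthogonal complement $P^{\perp} \in G_{n,n-p}$, and similarly given $Q \in G_{n,q}$ with 
orthogonal complement $Q^{\perp} \in G_{n,n-q}$, one has 

$$p = \|P^{H} Q\|_F^2 + \|P^{H} Q^{\perp}\|_F^2, \tag{75}$$

which leads to 

$$d_c^2(P, Q) = p - d_c^2(P, Q^{\perp}). \tag{76}$$

Given a point $Q \in G_{n,q}$ such that $d_c(P, Q) \ge r$, it follows from (76) and 
$d_c(P, Q^{\perp}) = d_c(P^{\perp}, Q)$ that $d_c(P^{\perp}, Q) \le \sqrt{p - r^2}$, 
and thus $Q \notin B_{P,q}(r)$ implies $Q \in B_{P^{\perp},q}(\sqrt{p - r^2})$. 
Reciprocally, $Q \in B_{P,q}(r)$ implies $Q \notin B_{P^{\perp},q}(\sqrt{p - r^2})$, so that 

\begin{align}
B_{P,q}(r) \cap B_{P^{\perp},q}\big(\sqrt{p - r^2}\big) &= \emptyset \tag{77} \\
B_{P,q}(r) \cup B_{P^{\perp},q}\big(\sqrt{p - r^2}\big) &= G_{n,q}. \tag{78}
\end{align}

Finally, 

$$\mu\big(B_{P,q}(r)\big) + \mu\big(B_{P^{\perp},q}(\sqrt{p - r^2})\big) = 1 \tag{79}$$

and using (9), (10), we obtain (11). 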
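The identities used in this proof can be checked numerically on random subspaces. The sketch below is illustrative only and rests on assumptions: subspaces are represented by orthonormal bases, the squared chordal distance is taken as min(p, q) - ||P^H Q||_F^2, and the dimensions satisfy p <= q and p + q <= n. Function names are hypothetical.

```python
import numpy as np


def basis_and_complement(n, k, rng):
    """Orthonormal bases of a random k-dim subspace of C^n and its complement."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    u, _ = np.linalg.qr(z)  # n x n unitary matrix
    return u[:, :k], u[:, k:]


def chordal_dist_sq(a, b):
    """Assumed squared chordal distance: min(dims) - ||A^H B||_F^2."""
    k = min(a.shape[1], b.shape[1])
    return k - np.linalg.norm(a.conj().T @ b, "fro") ** 2


def symmetry_residuals(n, p, q, seed=0):
    """Residuals of (75), (76) and of the relation used for the ball argument."""
    rng = np.random.default_rng(seed)
    P, P_perp = basis_and_complement(n, p, rng)
    Q, Q_perp = basis_and_complement(n, q, rng)
    res_75 = (np.linalg.norm(P.conj().T @ Q, "fro") ** 2
              + np.linalg.norm(P.conj().T @ Q_perp, "fro") ** 2 - p)
    res_76 = chordal_dist_sq(P, Q) + chordal_dist_sq(P, Q_perp) - p
    res_ball = chordal_dist_sq(P, Q) + chordal_dist_sq(P_perp, Q) - p
    return res_75, res_76, res_ball


if __name__ == "__main__":
    print(symmetry_residuals(6, 2, 3))
```

All three residuals vanish up to rounding error, which reflects that $d_c^2(P, Q)$ and $d_c^2(P^{\perp}, Q)$ sum to $p$ under the stated assumptions.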


Appendix B 

Andreief integral [30] 

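For two $n \times n$ matrices $\mathbf{A}(\mathbf{x})$ and $\mathbf{B}(\mathbf{x})$, with the respective $ij$-th entries being functions $A_i(x_j)$ 
and $B_i(x_j)$, and a function $f(\cdot)$ such that the integral $\int_a^b A_i(x) B_j(x) f(x) \, dx$ exists, the multiple 
integral of the product of the determinants can be evaluated as 

$$\int_{\mathcal{D}} \det(\mathbf{A}(\mathbf{x})) \det(\mathbf{B}(\mathbf{x})) \prod_{i=1}^{n} f(x_i) \, dx_i = \det\left( \int_a^b A_i(x) B_j(x) f(x) \, dx \right), \tag{80}$$

where $\mathcal{D} = \{a \le x_n \le \cdots \le x_1 \le b\}$. 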
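The identity (80) can be verified numerically in a simple case. The sketch below is an illustration with assumed choices: $A_i(x) = B_i(x) = x^{i-1}$, $f(x) = 1$ and $[a, b] = [0, 1]$, so that both determinants are Vandermonde determinants and the right-hand side is the determinant of a Hilbert matrix. Since the integrand is symmetric in the variables, the integral over the ordered domain equals the integral over the cube divided by $n!$; the latter is computed exactly with tensor-product Gauss-Legendre quadrature.

```python
import itertools
import math

import numpy as np


def andreief_lhs(n, num_nodes=8):
    """Left-hand side of (80) for A_i(x) = B_i(x) = x^(i-1), f = 1 on [0, 1]."""
    nodes, weights = np.polynomial.legendre.leggauss(num_nodes)
    x = 0.5 * (nodes + 1.0)  # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * weights
    total = 0.0
    for idx in itertools.product(range(num_nodes), repeat=n):
        idx = list(idx)
        vandermonde = np.vander(x[idx], n, increasing=True)
        det = np.linalg.det(vandermonde)
        total += det * det * np.prod(w[idx])
    # symmetric integrand: ordered domain = cube / n!
    return total / math.factorial(n)


def andreief_rhs(n):
    """Right-hand side of (80): entries int_0^1 x^(i+j) dx = 1 / (i + j + 1)."""
    gram = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
    return np.linalg.det(gram)


if __name__ == "__main__":
    for n in (2, 3):
        print(n, andreief_lhs(n), andreief_rhs(n))
```

For n = 2 both sides equal 1/12. The quadrature is exact here because the integrand is a polynomial of degree 2(n - 1) in each variable, well within the exactness range of eight Gauss-Legendre nodes.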
Appendix C 

Examples of Exact Volume Computation 

We give explicit expressions of $\mu(B(r))$ obtained by (30) and (31). The volumes have different 
polynomial representations between consecutive integer values of $r^2$, i.e. on the intervals 
$[0, 1], [1, 2], \ldots, [p - 1, p]$. The expressions given here are valid for any $r^2 \in [0, p]$. It can be 
verified for $r \le 1$ that the expressions simplify (e.g. to a monomial for p = q) and match the 
results in [13]. 

1) n = 4 and p = q = 2: 

$$\mu(B(r)) = -\frac{7}{2} + 8r^2 - 6r^4 + 2r^6 - \frac{(r^2 - 1)^3 \left(7 - 2r^2 + r^4\right)}{2|r^2 - 1|}$$

2) n = 5 and p = q = 2: 

$$\mu(B(r)) = \frac{17}{2} - \frac{144 r^2}{5} + 36 r^4 - 20 r^6 + \frac{9 r^8}{2} - \frac{(r^2 - 1)^4 \left(85 - 33 r^2 + 6 r^4 + 2 r^6\right)}{10|r^2 - 1|}$$

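3) n = 5, p = 2 and q = 3: 

$$\mu(B(r)) = -\frac{59}{10} + \frac{96 r^2}{5} - 24 r^4 + 16 r^6 - \frac{9 r^8}{2} + \left( \frac{59}{10} - \frac{3 r^2}{2} + \frac{9 r^4}{5} - \frac{r^6}{5} \right) |r^2 - 1|^3$$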

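4) n = 6 and p = q = 2: 

$$\mu(B(r)) = -\frac{31}{2} + \frac{480 r^2}{7} - 120 r^4 + 104 r^6 - 45 r^8 + 8 r^{10} - \frac{(r^2 - 1)^5 \left(217 - 92 r^2 + 10 r^4 + 4 r^6 + r^8\right)}{14|r^2 - 1|}$$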

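5) n = 6, p = 2 and q = 3: 

$$\mu(B(r)) = \frac{263}{14} - \frac{576 r^2}{7} + 144 r^4 - 128 r^6 + 60 r^8 - 12 r^{10} - \frac{(r^2 - 1)^5 \left(-263 + 100 r^2 - 38 r^4 - 12 r^6 + 3 r^8\right)}{14|r^2 - 1|}$$

6) n = 6 and p = q = 3: 

\begin{align}
\mu(B(r)) = {} & -\frac{6547}{28} + \frac{19683 r^2}{28} - \frac{6561 r^4}{7} + 729 r^6 - \frac{729 r^8}{2} + \frac{243 r^{10}}{2} - 27 r^{12} + \frac{27 r^{14}}{7} - \frac{9 r^{16}}{28} + \frac{r^{18}}{42} \\
& + \frac{6 (r^2 - 1)^7 - 9 |r^2 - 1|^6 - \frac{18}{7} |r^2 - 1|^8 - \frac{1}{28} |r^2 - 1|^{10}}{|r^2 - 1|} \\
& + \frac{6 (r^2 - 2)^7 + 9 |r^2 - 2|^6 + \frac{18}{7} |r^2 - 2|^8 + \frac{1}{28} |r^2 - 2|^{10}}{|r^2 - 2|}
\end{align}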
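As a sanity check, the first example (n = 4, p = q = 2) can be evaluated numerically and compared with a Monte Carlo estimate. The sketch below is illustrative and rests on assumptions: the squared chordal distance is taken as p - ||P^H Q||_F^2, the center is fixed to the span of the first two canonical basis vectors (which loses no generality by unitary invariance), and uniform subspaces are drawn from the QR factorization of a complex Gaussian matrix. The factor $(r^2 - 1)^3 / |r^2 - 1|$ is written as $(r^2 - 1)|r^2 - 1|$ to avoid a division by zero at r = 1. Function names are hypothetical.

```python
import math

import numpy as np


def volume_n4_p2(r):
    """Closed-form volume for n = 4, p = q = 2, valid for 0 <= r^2 <= 2."""
    t = r * r
    poly = -3.5 + 8.0 * t - 6.0 * t**2 + 2.0 * t**3
    # (t - 1)^3 / |t - 1| rewritten as (t - 1) * |t - 1|
    return poly - 0.5 * (t - 1.0) * abs(t - 1.0) * (7.0 - 2.0 * t + t**2)


def empirical_volume(r, num_samples, rng):
    """Fraction of uniform 2-planes Q in C^4 with d_c^2(P, Q) <= r^2."""
    count = 0
    for _ in range(num_samples):
        z = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
        q, _ = np.linalg.qr(z)
        # P = span(e1, e2), so ||P^H Q||_F^2 is the energy in the first two rows
        dist_sq = 2.0 - np.linalg.norm(q[:2, :], "fro") ** 2
        count += int(dist_sq <= r * r)
    return count / num_samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for r in (0.8, 1.0, 1.1, 1.3):
        print(r, volume_n4_p2(r), empirical_volume(r, 5000, rng))
```

For $r \le 1$ the closed form reduces to the monomial $r^8 / 2$, and it satisfies $\mu(B(r)) + \mu(B(\sqrt{2 - r^2})) = 1$, in line with the symmetry relation proved in Appendix A.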
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